Table 1. Top 50 abundant assembled contigs using Megahit (v.1.2.9), Database

Contig ID | Contig length | Abundance | GenBank accession | Subject title | Identity (%) | Length | E-value | Bitscore
k141_ 11989 |29.802 ————*[120.341  |MN9080473 = sd Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome 100,000 0,000E+00 | 55.033 | 
irst_s1989 f2a.00 ——freose | 0K372407.1 Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/TN-VUMC- 100,000 0,000E+00 | secs | 

000018/2020, complete genome 
k141_11989 |29.802 =| 120.341 ——-|[MG772933.1_——__—|[Bat SARS-like coronavirus isolate bat-SL-CoVZC45, complete genome 0,000E+00 | 26.943 _| 
k141_11989 |29.802 =| 120.341 ~—([AY274119.3. [SARS coronavirus Tor, complete genome 0,000E+00 | 15.175 _| 
k141_11989 [29.802 =| 120.341 ——‘([AY278741.1_—_[SARS coronavirus Urbani, complete genome 0,000E+00 | 15.169 _| 
k141_5255 [5.659 ——*[290.188 ~~ |cP040006.1_ ~=~=~—~—_—|Schaalia odontolytica strain XH001 chromosome, complete | 5.115 | 0,000E+00[ 8.379 | 
_ 213.744 CP012410.1 Leptotrichia sp, oral taxon 212 strain W10393, complete genome 0,000E+00 8.059 
a 933.303 CP003667.1 Prevotella sp, oral taxon 299 str, F0039 plasmid, complete sequence 0,000E+00 3.083 
k141_14692 |3.364 313.495 CP040504.1 Neisseria sp, oral taxon 014 str, F0314 chromosome 0,000E+00 4.357 
k141_631 [3.014 2.200.770 ‘|LR7781741 si Veillonella parvula strain SKV38 genome assembly, chromosome: 1 | 2.844 | 0,000E+00| 4.150 | 
k141_ 11680 [2.964 —s-‘(116.125 — [APO19846.1 Cd Leptotrichia hongkongensis JMUB5056 DNA, complete genome 2.962 0,000E+00 | 5.066 | 
k141_27232 [2811 |1.407.705 [MF 164264.1 i 2.436 | 0,000E+00 | 4.337_| 
k141_ 27232 [2.811 ———([t.407.705 INR 1461171 si Homo sapiens RNA, 45S pre-ribosomal N4 (RNA45SN4), ribosomal RNA | 2.435 | 0,000E+00| 4.335 | 
k141_ 27232 [2.811 ———«‘[.407.705  |cPoes263.2 sid Homo sapiens isolate CHM13 chromosome 15 2.434 0,000E+00 | 4.335 | 
k141_ 27232 2.811 ————=*|4.407.705 [AL353644.3————_— [Human DNA sequence from clone RP11-164K15 on chromosome 22, complete sequence 2.435 | 0,000E+00 | 4.335 __| 
k141_27232 [2.811 ———=*|4.407.705 |NR_046235.3.—_—[Homo sapiens RNA, 45S pre-ribosomal N5 (RNA45SN5), ribosomal RNA 2.435 | 0,000E+00 | 4.331 __—| 
k141_ 16443 |2.626 693.461 |[CP020566.1_ —_[Veillonella atypica strain OK5, complete genome 2.190 | 0,000E+00 | 3.633 _| 
k141_10436 2.373 ‘(425.138 ——[LR134384.1_——_[ Prevotella oris strain NCTC13071 genome assembly, chromosome: 1 | 2390 | 0,000E+00| 3.530 | 
k141_6751 CP068256.2 Homo sapiens isolate CHM13 chromosome 22 0,000E+00 
k141_6751 CP068257.2 Homo sapiens isolate CHM13 chromosome 21 0,000E+00 
k141_6751 CP068263.2 Homo sapiens isolate CHM13 chromosome 15 0,000E+00 
k141_ 6751 AC231275.2 Rel eed FOSMID clone ABC12-46987300E 12 from chromosome unknown, complete 0,000E+00 
k141_6751 [2.309 «(99.053 ~—s|mT497387.1 =i Homo sapiens clone BAC JH13 genomic sequence 2.199 0,000E+00 | 3.986 | 
k141_ 10208 1.812 ———«*([ 1.400.352 |FM996435.1_—~——*[Uncultured bacterium partial 16S rRNA gene, clone 16sps19-1902,pika 1.405 | 0,000E+00 | 2.359 | 
Homo sapiens external transcribed spacer 18S ribosomal RNA gene, internal transcribed spacer 1, 
k141_ 12515 |1.744 184.440 KY962518.1 5,8S ribosomal RNA gene, internal transcribed spacer 2, 28S ribosomal RNA gene, and external 1.626 0,000E+00 2.942 
transcribed spacer, complete sequence 
k141_12515 |1.744 (184.440 |MF164269.1_ | Homo sapiens clone BAC JH1 genomic sequence | 1.626 | 0,000E+00 | 2.942 | 
k141_ 12515 |1.744 ‘| 184.440 [NR 1461441 ———_—[Homo sapiens RNA, 45S pre-ribosomal N2 (RNA45SN2), ribosomal RNA | 1.626 | 0,000E+00| 2.942 | 
k141_12515 |1.744 | 184.440 —[NR_146148.1_—_—[Homo sapiens RNA, 28S ribosomal N2 (RNA28SN2), ribosomal RNA 1.626 | 0,000E+00 | 2942 | 
fe 184.440 NR_145822.1 Homo sapiens RNA, 28S ribosomal N1 (RNA28SN1), ribosomal RNA 1.626 0,000E+00 2.942 

= x 248.289 JQ460207.1 Uncultured bacterium clone 070054_332 16S ribosomal RNA gene, partial sequence 1.287 0,000E+00 2.248 

A k141_ 26154 |1.211 136.391 JQ454767.1 Uncultured bacterium clone 069096_294 16S ribosomal RNA gene, partial sequence 925 0,000E+00 1.679 

= k141_24818 |1.195 ‘(583.961 — uQ4eoi41.1 si Uncultured bacterium clone 070054_143 16S ribosomal RNA gene, partial sequence | 1.178 | 0,000E+00 | 1.637 | 

* k141_ 22555 11.131  ——s—-[232.205 ~—s |cpoo3e67.1 = sd Prevotella sp, oral taxon 299 str, F0039 plasmid, complete sequence | 1.129 | 0,000E+00| 1.735 | 

Se k141. 8265 |1.128 [183.428 ~~ [MN8495151 si Homo sapiens isolate NJ44 haplogroup P1d1 mitochondrion, complete genome 1.124 0,000E+00 | 2.041 | 

to si(kt41_9265 [1.128 [183.428 [MNo98712.1 ns isolate HG3118 haplogroup P1d1 mitochondrion, complete genome 1.124 | 0,000E+00 | 2.041 | 

© {ki41.9265 [1.128 (183.428 ~~ [MN706604.1_——_—[Homo sapiens isolate BachoKiro_ BK_1653 mitochondrion, complete genome 1.124 | 0,000E+00 | 2.041 | 

E |ki41_9265 [1.128 (183.428 |MF437277.1___—_—| Homo sapiens isolate 250 mitochondrion, complete genome 1.124 | 0,000E+00 | 2.041 | 

oO k141_8265 [1.128  ——==—*[183.428 — [MK491356.1__—_—[Homo sapiens isolate 2 Mu mitochondrion, complete genome 1.124 0,000E+00 | 2.041 | 

= {ki41_11940 |1.097————«*(508.437_[CP072350.1 __[Prewotella melaninogenica strain F0301 chromosome 2, complete sequence 1.097 | 0,000E+00 | 1.999 | 

> {k141_10110 [1.043 [85.091 [cP072347.1 ——_|Prevotella melaninogenica strain F0516 chromosome 2, complete sequence | 882 | 0,000E+00[ 1.206 | 

o k141_5437 [1.007 1.295.606 _|CP072360.1 Prevotella melaninogenica strain F0091 chromosome 1, complete sequence 98,512 1.008 0,000E+00 1.777 

= k141_ 20271 |992 188.951 CP023863.1 Prevotella jejuni strain CD3:33 chromosome |, complete sequence 97,414 580 0,000E+00 985 


Contig length | Abundance| GenBank accession Length | E-value | Bitscore 


k141_4074 ]o17—ssss«*d'1 74.529 [CP068256.2 97,686 778 0,000E+00 | 1.338 
ki41 4074 |9i7—Ss«d'4 74.529 [CP068257.2 97,686 778 0,000E+00 | 1.338 
Ik141_4074 [917 ——s*i't74.529 ~~ [CP068263.2 97,686 778 0,000E+00 | 1.338 
k141_4074 ]oi7—sss«*('474.529 [CP 086022.1 97,452 785 0,000E+00 | 1.338 
ki41 4074 917 —Ss«*('474.529 | AP025035.1 97,452 785 0,000E+00 | 1.338 


ki41 5448 [913 «78.604 CP024724.1 91,730 919 0,000E+00 | 1.264 
k141_ 24026 |850 CP085934.1 Prevotella copri DSM 18205 strain FDAARGOS_1573 97,291 406 | 0,000E+00 | 688 
ki41 4059 |812 «(432.074 [CP072363.1 98,684 532 0,000E+00 942 
k141.525 [754 —s«(247.201 ——‘|CP072333.1 98,802 501 0,000E+00 891 


ki41.9586 [656 ————*[762.405 —_ [CP016205.1 Prevotella scopos JCM 17725 strain W2052 chromosome 2 98,176 658 0,000E+00 | 1.147 
k141_ 19969 MT242596.1 100,000 477 0,000E+00 881 
k141_ 19969 OL521838.1 100,000 477 0,000E+00 881 
k141_ 19969 OK104093.1 100,000 477 0,000E+00 881 


k141_ 19969 OK266950.1 100,000 477 0,000E+00 881 
k141_ 19969 |639 OK239657.1 100,000 477 0,000E+00 881 
ki41.534 |e20 —s—[199.625 —_ [CP072361.1 Prevotella in FO054 chromosome 1, complete sequence 98,387 620 0,000E+00 | —_ 1.090 
ki41 7697 [569  ~——«*([360.511 —_ [CP023863.1 Prevotella jejuni strain CD3:33 chromosome |, complete sequence 96,473 567 0,000E+00 935 
k141_ 11094 |532 CP072331.1 Prevotella veroralis strain F0319 chromosome 2, complete sequence 97,543 529 0,000E+00 904 


k141_ 17668 CP085941.1 Prevotella melaninogenica strain FDAARGOS_1567 chromosome 2, complete sequence 99,018 509 0,000E+00 911 
k141_11371 |427 CP072360.1 Prevotella melaninogenica strain F0091 chromosome 1, complete sequence 99,532 427 0,000E+00 778 
k141_ 9606 |408 63.287 CP072330.1 Prevotella veroralis strain F0319 chromosome 1, complete sequence 98,529 408 0,000E+00 721 


k141_25754 MN297237.1 Homo sapiens LHRI_LNC32,2 IncRNA gene, complete sequence 100,000 256 4,700E-129 473 
k141_25754 MN297236. 1 Homo sapiens LHRI_LNC32,1 IncRNA gene, complete sequence 100,000 256 4,700E-129 473 
k141_ 25754 CP034492.1 Eukaryotic synthetic construct chromosome 14 100,000 256 4,700E-129 473 
k141_ 25754 NG_050638.2 Homo sapiens ribosomal protein S29 (RPS29), RefSeqGene (LRG_1147) on chromosome 14 100,000 256 4,700E-129 473 
k141_25754 XR_001750762.1 PREDICTED: Homo sapiens uncharacterized LOC 107987206 (LOC107987206), ncRNA 100,000 256 4,700E-129 473 
k141_ 13347 JQ470050.1 Uncultured bacterium clone 071024 066 16S ribosomal RNA gene, partial sequence 100,000 351 0,000E+00 649 
k141_ 17635 LT677940. 1 a melaninogeni ial 16S rRNA gene, isolate 219N_3354 100,000 335 4,900E-173 619 
k141_ 14693 CP023863.1 a jejuni strain CD3:33 chromosome |, complete sequence 98,160 326 4,910E-158 569 
k141_21608 |324 MW717453.1 KCOM 3945 16S ribosomal RNA gene, partial sequence 100,000 324 6,150E-167 599 
k141_1252 [321 [182.075 —_|CP023863.1 jejuni strain CD3:33 chromosome |, complete sequence 99,688 321 1,320E-163 | _ 588 
k141_10440 CP003667.1 xon 299 str, F0039 plasmid, complete sequence 99,686 318 6,070E-162 582 
k141_ 14695 CP023863.1 jejuni n CD3:33 chromosome |, complete sequence 99,371 318 2,820E-160 577 
k141_ 10440 AP024484.1 1 DNA, complete genome 94,688 320 2,930E-135 494 
ki41 2510 [286  —————-[83128 CP023864.1 jejuni strain CD3:33 chromosome ll, complete sequence 100,000 286 | 7,110E-146| 529 


k141_ 19087 LR778174.1 arvula st 100,000 281 4,200E-143| 520 
k141.6152 [245 [186296 CP023864.1 jejuni strain CD3: 99,190 247 | 2,220E-120| 444 
ki41 5974 [234 ~=—*([421016 LC358497.1 Uncultured bacterium 62MG04014 gene for 16S rRNA, partial sequence 100,000 234 | 4,550E-117| 433 
k141_ 17500 |232 MF801036. 1 red bacterium clone saliva72 16S ribosomal RNA gene, partial sequence 100,000 232 5,820E-116 429 


k141_ 17824 |208 CP072331.1 a veroralis strain F0319 chromosome 2, complete sequence 98,387 186 1,930E-85 327 
k141_7700 {206 CP085941.1 a melaninogenica strain FDAARGOS_1567 chromosome 2, complete sequence 100,000 206 1,440E-101 381 


Table 2. Top 50 longest assembled contigs using Megahit (v.1.2.9), Database query from 20.12.2021


Contig ID | Contig length | GenBank accession | Subject title | Identity (%) | Length | E-value | Bitscore
k141_11989 | 29.802 | MN908947.3 | Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome | 100,000 | 29.801 | 0,000E+00 | 55.033
k141_11989 | 29.802 | OK372407.1 | Severe acute respiratory syndrome coronavirus 2 isolate SARS-CoV-2/human/USA/TN-VUMC-000018/2020, complete genome | 100,000 | 29.801 | 0,000E+00 | 55.033
k141_11989 | 29.802 | MG772933.1 | Bat SARS-like coronavirus isolate bat-SL-CoVZC45, complete genome | 89,120 | 28.464 | 0,000E+00 | 26.943
k141_11989 | 29.802 | AY274119.3 | SARS coronavirus Tor2, complete genome | 82,300 | 26.631 | 0,000E+00 | 15.175
k141_11989 | 29.802 | AY278741.1 | SARS coronavirus Urbani, complete genome | 82,300 | 26.632 | 0,000E+00 | 15.169
k141_7303 | 16.036 | AP019846.1 | Leptotrichia hongkongensis JMUB5056 DNA, complete genome | 99,183 | 16.035 | 0,000E+00 | 28.884
k141_20796 | 13.656 | LR778174.1 | Veillonella parvula strain SKV38 genome assembly, chromosome: 1 | 99,817 | 13.653 | 0,000E+00 | 25.074
k141_21803 | 11.776 | CP012410.1 | Leptotrichia sp. oral taxon 212 strain W10393, complete genome | 99,236 | 11.777 | 0,000E+00 | 21.246
ki4t_to4a4 | a.633 | | No significant similarity found. | | | |
k141_19747 | 8.210 | CP072359.1 | Prevotella melaninogenica strain F0091 chromosome 2, complete sequence | 97,381 | 8.209 | 0,000E+00 | 13.967
ki4t_to7e7 | 7.584 | | No significant similarity found. | | | |
k141_21891 | 7.432 | CP072330.1 | Prevotella veroralis strain F0319 chromosome 1, complete sequence | 98,145 | 7.333 | 0,000E+00 | 12.787
k141_17952 | 7.169 | CP068257.2 | Homo sapiens isolate CHM13 chromosome 21 | 98,558 | 5.825 | 0,000E+00 | 10.277
k141_17952 | 7.169 | FP236383.1 | Human DNA sequence from clone CH507-528H12 on chromosome 21, complete sequence | 98,558 | 5.826 | 0,000E+00 | 10.277
k141_17952 | 7.169 | MF164268.1 | Homo sapiens clone BAC JH12 genomic sequence | 98,558 | 5.824 | 0,000E+00 | 10.275
k141_17952 | 7.169 | MF164265.1 | Homo sapiens clone BAC JH6 genomic sequence | 98,558 | 5.824 | 0,000E+00 | 10.275
k141_17952 | 7.169 | MF164263.1 | Homo sapiens clone BAC JH3 genomic sequence | 98,558 | 5.824 | 0,000E+00 | 10.275
k141_798 | 7.150 | CP072330.1 | Prevotella veroralis strain F0319 chromosome 1, complete sequence | 97,929 | 7.147 | 0,000E+00 | 12.379
k141_19768 | 7.024 | CP072330.1 | Prevotella veroralis strain F0319 chromosome 1, complete sequence | 93,910 | 7.028 | 0,000E+00 | 10.599
k141_13219 | 5.924 | Mzogedgo.t | | 98,992 | 5.856 | 0,000E+00 | 10.484
k141_13219 | 5.924 | Mzog23ot.t | | 98,890 | 5.856 | 0,000E+00 | 10.447
k141_13219 | 5.924 | AC008496.6 | | 98,889 | 5.853 | 0,000E+00 | 10.447
k141_13219 | 5.924 | zog2335.1 | | 98,873 | 5.856 | 0,000E+00 | 10.447
k141_13219 | 5.924 | CPosa4on.1 | | 98,824 | 5.865 | 0,000E+00 | 10.445
k141_2458 | 5.854 | CP072330.1 | Prevotella veroralis strain F0319 chromosome 1, complete sequence | 95,914 | 5.825 | 0,000E+00 | 9.415
k141_6967 | 5.802 | | No significant similarity found. | | | |
k141_5255 | 5.659 | CP040008.1 | Schaalia odontolytica strain XH001 chromosome, complete genome | 96,364 | 5.115 | 0,000E+00 | 8.379
k141_12253 | 5.414 | CP012410.1 | Leptotrichia sp. oral taxon 212 strain W10393, complete genome | 96,229 | 4.932 | 0,000E+00 | 8.059
k141_10094 | 5.338 | CP072330.1 | Prevotella veroralis strain F0319 chromosome 1, complete sequence | 98,782 | 5.336 | 0,000E+00 | 9.492
k141_10156 | 5.173 | CP072330.1 | Prevotella veroralis strain F0319 chromosome 1, complete sequence | 97,969 | 5.169 | 0,000E+00 | 8.964
k141_10855 | 4.989 | CP072345.1 | Prevotella melaninogenica strain F0692 chromosome 1, complete sequence | 95,516 | 2.788 | 0,000E+00 | 4.457
k141_17480 | 4.869 | CP072331.1 | Prevotella veroralis strain F0319 chromosome 2, complete sequence | 98,122 | 4.792 | 0,000E+00 | 8.346
k141_12356 | 4.772 | LR134384.1 | Prevotella oris strain NCTC13071 genome assembly, chromosome: 1 | 87,967 | 4.712 | 0,000E+00 | 5.542
k141_19351 | 4.750 | CP0es990.1 | Veillone' ja strain FDAARGOS_1046 chromosome, complete genome | 95,833 | 48 | 2,130E-08 | 77
k141_16413 | 4.712 | CP001650.1 | Zunongwangia profunda SM-A87, complete genome | 100,000 | 29 | 9,900E-02 | 55
k141_5355 | 4.708 | LR778174.1 | Veillonella parvula strain SKV38 genome assembly, chromosome: 1 | 99,745 | 4.708 | 0,000E+00 | 8.628
k141_27644 | 4.414 | CP072330.1 | Prevotella veroralis strain F0319 chromosome 1, complete sequence | 97,281 | 4.413 | 0,000E+00 | 7.483
k141_23318 | 4.389 | CP072331.1 | Prevotella veroralis strain F0319 chromosome 2, complete sequence | 98,313 | 4.387 | 0,000E+00 | 7.692
k141_12081 | 4.300 | CP072330.1 | Prevotella veroralis strain F0319 chromosome 1, complete sequence | 97,836 | 4.298 | 0,000E+00 | 7.422
k141_3691 | 4.291 | LR778174.1 | Veillonella parvula strain SKV38 genome assembly, chromosome: 1 | 99,790 | 4.291 | 0,000E+00 | 7.875


k141_3454 | 4.225 | | No significant similarity found. | | | |
k141_9806 | 4.222 | CP072364.1 | Prevotella jejuni strain F0697 chromosome 2, complete sequence | | 4.222 | 0,000E+00 | 7.559
k141_7015 | 4.133 | CP072330.1 | Prevotella veroralis strain F0319 chromosome 1, complete sequence | | 4.130 | 0,000E+00 | 7.328
k141_7976 | 4.088 | CP072365.1 | Prevotella jejuni strain F0106 chromosome 1, complete sequence | | 4.085 | 0,000E+00 | 7.095
k141_5351 | 4.084 | OU452294.1 | Pammene fasciana genome assembly, chromosome: 22 | | 36 | 8,600E-02 | 55
k141_16288 | 4.068 | MZ824237.1 | Reagent-associated CRESS-like virus 1 isolate 7 replicase-like gene, partial sequence | | 3.878 | 0,000E+00 | 7.095
k141_14229 | 4.054 | CP072331.1 | Prevotella veroralis strain F0319 chromosome 2, complete sequence | | 3.995 | 0,000E+00 | 7.031
k141_25910 | 4.028 | CP072330.1 | Prevotella veroralis strain F0319 chromosome 1, complete sequence | | 4.025 | 0,000E+00 | 7.112
k141_1855 | 3.995 | CP072331.1 | Prevotella veroralis strain F0319 chromosome 2, complete sequence | | 3.994 | 0,000E+00 | 6.780
k141_18965 | 3.971 | XM_045317038.1 | PREDICTED: Mercenaria mercenaria uncharacterized LOC123534692 (LOC123534692), mRNA | 100,000 | 29 | 8,300E-02 | 55
k141_11609 | 3.889 | CP072330.1 | Prevotella veroralis strain F0319 chromosome 1, complete sequence | 98,277 | 3.889 | 0,000E+00 | 6.811
k141_7725 | 3.861 | | No significant similarity found. | | | |
k141_4971 | 3.759 | CP019721.1 | Veillonella parvula strain UTDB1-3, complete genome | 98,857 | 3.761 | 0,000E+00 | 6.706
k141_8774 | 3.756 | AP019846.1 | Leptotrichia hongkongensis JMUB5056 DNA, complete genome | 97,815 | 3.753 | 0,000E+00 | 6.473
k141_21871 | 3.723 | CP072331.1 | Prevotella veroralis strain F0319 chromosome 2, complete sequence | 100,000 | 30 | 2,200E-02 | 57
k141_4917 | 3.703 | CP072330.1 | Prevotella veroralis strain F0319 chromosome 1, complete sequence | | 3.678 | 0,000E+00 | 6.329
k141_12306 | 3.688 | MW046375.1 | Phoenicopteridae parvo-like hybrid virus isolate par083par024 genomic sequence | | | 0,000E+00 | 5.018
k141_3180 | 3.634 | | No significant similarity found. | | | |
k141_24646 | 3.624 | CP072331.1 | Prevotella veroralis strain F0319 chromosome 2, complete sequence | | 3.623 | 0,000E+00 | 6.408
k141_10404 | 3.536 | CP019721.1 | Veillonella parvula strain UTDB1-3, complete genome | | 3.536 | 0,000E+00 | 6.429
k141_14387 | 3.526 | CP003667.1 | Prevotella sp. oral taxon 299 str. F0039 plasmid, complete sequence | | 2.100 | 0,000E+00 | 3.083
k141_5501 | 3.498 | CP072330.1 | Prevotella veroralis strain F0319 chromosome 1, complete sequence | | 3.495 | 0,000E+00 | 5.945


Table 3. Reference sequences used in this study. 


number 


Bronchoalveolar lavage fluid (human) 

Intestinal tissues (bat) 

| Bat-SARS-CoV Short | MG772933_short* | 6.420 | 
Serum (human) 


Zika NC_035889.1 10.808 Placenta, lungs, heart, skin, spleen, thymus, 
liver, kidneys, and cerebral cortex (human) 


AF266291.1 15.894 
KJ410048.1 15.894 | Throat swab (human), Vero-hSLAM cells 


Anarene 

SARS-tor AY274119.3 29.751 
Oral and rectal swabs, and whole blood (animal) 
*) Name of FASTA file. 


Table 4. Results of the consensus sequence analyses 


ID | Library | Name | Accession | Length | Number of mapped reads | Minimum length in % (M1) | Minimum identity in % (M2) | Number of selected reads | Proportion of mapped reads | Number of contigs | Longest contig | Error rate in % related to the contigs (R1) | Error rate in % related to reference sequence length (R2)

1 SRR10971381 SARS-CoV-2 MN908947 29.903 479.694 47 0,50 264.281 55,09% 1 29.903 0,00% 0,00% 
2 SRR10971381 SARS-CoV-2 MN908947 29.903 479.694 37 0,60 227.010 47,32% 1 29.903 0,00% 0,00% 
3 SRR10971381 SARS-CoV-2 MN908947 29.903 479.694 32 0,60 262.885 54,80% 1 29.903 0,00% 0,00% 
4 SRR10971381 SARS-CoV-2 MN908947 29.903 479.694 30 0,60 270.025 56,29% 1 29.903 0,00% 0,00% 
5 SRR10971381 SARS-CoV-2 MN908947 29.903 479.694 25 0,62 233.537 48,68% 1 29.903 0,00% 0,00% 
6 SRR10971381 SARS-CoV-2 MN908947 29.903 479.694 47 (max. 100) 0,50 131.893 27,50% 1 29.855 19,10% 19,23% 
7 SRR10971381 SARS-CoV-2 MN908947 29.903 479.694 37 (max. 100) 0,60 98.776 20,59% 1 29.878 29,90% 29,96% 
8 SRR10971381 SARS-CoV-2 MN908947 29.903 479.694 32 (max. 100) 0,60 134.651 28,07% 1 29.885 16,00% 16,05% 
9 SRR10971381 SARS-CoV-2 MN908947 29.903 479.694 30 (max. 100) 0,60 141.791 29,56% 1 29.893 14,10% 14,13% 
10 SRR10971381 SARS-CoV-2 MN908947 29.903 479.694 25 (max. 100) 0,62 105.604 22,01% 1 29.838 28,60% 28,76% 
11 SRR10971381 Bat-SARS-CoV MG772933 29.802 493.888 47 0,50 274.167 55,51% 1 29.802 9,70% 9,70% 
12 SRR10971381 Bat-SARS-CoV MG772933 29.802 493.888 37 0,60 227.731 46,11% 1 29.802 10,20% 10,20% 
13 SRR10971381 Bat-SARS-CoV MG772933 29.802 493.888 32 0,60 263.686 53,39% 1 29.802 9,70% 9,70% 
14 SRR10971381 Bat-SARS-CoV MG772933 29.802 493.888 30 0,60 270.827 54,84% 1 29.802 9,60% 9,60% 
15 SRR10971381 Bat-SARS-CoV MG772933 29.802 493.888 25 0,62 234.553 47,49% 1 29.802 10,20% 10,20% 
16 SRR10971381 Bat-SARS-CoV_Short MG772933_short 6.420 197.266 47 0,50 64.677 32,79% 1 6.410 11,70% 11,84% 
17 SRR10971381 Bat-SARS-CoV_Short MG772933_short 6.420 197.266 37 0,60 68.358 34,65% 1 6.414 12,20% 12,28% 
18 SRR10971381 Bat-SARS-CoV_Short MG772933_short 6.420 197.266 32 0,60 92.334 46,81% 1 6.420 10,70% 10,70% 
19 SRR10971381 Bat-SARS-CoV_Short MG772933_short 6.420 197.266 30 0,60 97.431 49,39% 1 6.420 10,60% 10,60% 
20 SRR10971381 Bat-SARS-CoV_Short MG772933_short 6.420 197.266 25 0,62 81.035 41,08% 1 6.420 12,00% 12,00% 
21 SRR10971381 HIV LC312715.1 8.819 315.060 47 0,50 125.861 39,95% 1 8.802 8,10% 8,28% 
22 SRR10971381 HIV LC312715.1 8.819 315.060 37 0,60 105.035 33,34% 1 8.797 8,60% 8,83% 
23 SRR10971381 HIV LC312715.1 8.819 315.060 32 0,60 140.425 44,57% 1 8.811 2,40% 2,49% 
24 SRR10971381 HIV LC312715.1 8.819 315.060 30 0,60 147.702 46,88% 1 8.814 2,00% 2,06% 
25 SRR10971381 HIV LC312715.1 8.819 315.060 25 0,62 112.543 35,72% 1 8.814 7,60% 7,65% 
26 SRR10971381 Hepatitis Delta NC_001653.2 1.682 163.002 47 0,50 59.234 36,34% 1 1.647 4,00% 6,00% 
27 SRR10971381 Hepatitis Delta NC_001653.2 1.682 163.002 37 0,60 49.517 30,38% 1 1.656 7,10% 8,54% 
28 SRR10971381 Hepatitis Delta NC_001653.2 1.682 163.002 32 0,60 70.689 43,37% 1 1.677 2,30% 2,59% 
29 SRR10971381 Hepatitis Delta NC_001653.2 1.682 163.002 30 0,60 74.744 45,85% 1 1.677 1,70% 1,99% 
30 SRR10971381 Hepatitis Delta NC_001653.2 1.682 163.002 25 0,62 60.074 36,85% 1 1.677 4,80% 5,08% 
31 SRR10971381  Zika NC_035889.1 10.808 310.070 47 0,50 105.438 34,00% 1 10.759 14,40% 14,79% 
32 SRR10971381  Zika NC_035889.1 10.808 310.070 37 0,60 87.258 28,14% 1 10.767 17,30% 17,61% 
33 SRR10971381 Zika NC_035889.1 10.808 310.070 32 0,60 126.216 40,71% 1 10.802 5,70% 5,75% 
34 SRR10971381  Zika NC_035889.1 10.808 310.070 30 0,60 133.190 42,95% 1 10.802 4,70% 4,75% 
35 SRR10971381  Zika NC_035889.1 10.808 310.070 25 0,62 100.930 32,55% 1 10.789 13,60% 13,75% 
36 SRR10971381 Measles 1 AF266291.1 15.894 313.628 47 0,50 100.344 31,99% 1 15.818 24,40% 24,76% 
37 SRR10971381 Measles 1 AF266291.1 15.894 313.628 37 0,60 87.565 27,92% 1 15.872 29,20% 29,30% 
38 SRR10971381 Measles 1 AF266291.1 15.894 313.628 32 0,60 123.325 39,32% 1 15.886 11,60% 11,64% 
39 SRR10971381 Measles 1 AF266291.1 15.894 313.628 30 0,60 130.537 41,62% 1 15.886 9,40% 9,45% 
40 SRR10971381 Measles 1 AF266291.1 15.894 313.628 25 0,62 99.632 31,77% 1 15.881 24,20% 24,26% 
41 SRR10971381 Measles 2 KJ410048.1 15.894 304.700 47 0,50 94.216 30,92% 1 15.837 24,80% 25,07% 
42 SRR10971381 Measles 2 KJ410048.1 15.894 304.700 37 0,60 85.392 28,02% 1 15.875 28,70% 28,79% 
43 SRR10971381 Measles 2 KJ410048.1 15.894 304.700 32 0,60 120.754 39,63% 1 15.886 11,40% 11,44% 
44 SRR10971381 Measles 2 KJ410048.1 15.894 304.700 30 0,60 127.848 41,96% 1 15.885 9,50% 9,55% 
45 SRR10971381 Measles 2 KJ410048.1 15.894 304.700 25 0,62 96.089 31,54% 1 15.841 24,50% 24,75% 



46 SRR10971381 SARS-CoV AY278741.1 29.727 460.238 47 0,50 238.799 51,89% 1 29.727 11,30% 11,30% 
47 SRR10971381 SARS-CoV AY278741.1 29.727 460.238 37 0,60 210.410 45,72% 1 29.727 12,80% 12,80% 
48 SRR10971381 SARS-CoV AY278741.1 29.727 460.238 32 0,60 244.768 53,18% 1 29.727 11,30% 11,30% 
49 SRR10971381 SARS-CoV AY278741.1 29.727 460.238 30 0,60 251.774 54,71% 1 29.727 11,10% 11,10% 
50 SRR10971381 SARS-CoV AY278741.1 29.727 460.238 25 0,62 215.316 46,78% 1 29.727 12,70% 12,70% 
51 SRR10971381 SARS-tor AY274119.3 29.751 462.518 47 0,50 240.919 52,09% 1 29.751 11,30% 11,30% 
52 SRR10971381 SARS-tor AY274119.3 29.751 462.518 37 0,60 212.884 46,03% 1 29.751 12,80% 12,80% 
53 SRR10971381 SARS-tor AY274119.3 29.751 462.518 32 0,60 247.370 53,48% 1 29.751 11,30% 11,30% 
54 SRR10971381 SARS-tor AY274119.3 29.751 462.518 30 0,60 254.412 55,01% 1 29.751 11,10% 11,10% 
55 SRR10971381 SARS-tor AY274119.3 29.751 462.518 25 0,62 218.011 47,14% 1 29.751 12,80% 12,80% 
56 SRR10971381 Ebola NC_039345.1 19.043 307.532 47 0,50 94.263 30,65% 1 19.043 26,80% 26,80% 
57 SRR10971381 Ebola NC_039345.1 19.043 307.532 37 0,60 84.338 27,42% 5 15.130 30,40% - 
58 SRR10971381 Ebola NC_039345.1 19.043 307.532 32 0,60 118.026 38,38% 1 19.043 16,20% 16,20% 
59 SRR10971381 Ebola NC_039345.1 19.043 307.532 30 0,60 125.008 40,65% 1 19.043 13,80% 13,80% 
60 SRR10971381 Ebola NC_039345.1 19.043 307.532 25 0,62 94.095 30,60% 1 18.808 32,00% 32,84% 
61 SRR10971381 Marburg NC_024781.1 19.114 318.728 47 0,50 111.206 34,89% 1 19.106 19,70% 19,73% 
62 SRR10971381 Marburg NC_024781.1 19.114 318.728 37 0,60 86.021 26,99% 1 19.090 31,60% 31,69% 
63 SRR10971381 Marburg NC_024781.1 19.114 318.728 32 0,60 125.752 39,45% 1 19.108 14,20% 14,23% 
64 SRR10971381 Marburg NC_024781.1 19.114 318.728 30 0,60 133.125 41,77% 1 19.102 12,20% 12,26% 
65 SRR10971381 Marburg NC_024781.1 19.114 318.728 25 0,62 100.037 31,39% 1 19.107 30,20% 30,23% 
66 SRR10971381 Rnd-Uniform rnd_uniform 29.903 333.742 47 0,50 107.283 32,15% 18 3.904 36,30% - 
67 SRR10971381 Rnd-Uniform rnd_uniform 29.903 333.742 37 0,60 91.049 27,28% 18 4.509 38,60% - 
68 SRR10971381 Rnd-Uniform rnd_uniform 29.903 333.742 32 0,60 126.231 37,82% 1 29.753 33,90% 34,23% 
69 SRR10971381 Rnd-Uniform rnd_uniform 29.903 333.742 30 0,60 133.205 39,91% 1 29.794 30,50% 30,75% 
70 SRR10971381 Rnd-Uniform rnd_uniform 29.903 333.742 25 0,62 98.636 29,55% 18 4.467 33,20% - 
71 SRR10971381 Rnd-Wuhan rnd_wuhan 29.903 313.606 47 0,50 104.960 33,47% 19 3.428 32,30% - 
72 SRR10971381 Rnd-Wuhan rnd_wuhan 29.903 313.606 37 0,60 82.914 26,44% 19 1.691 36,20% - 
73 SRR10971381 Rnd-Wuhan rnd_wuhan 29.903 313.606 32 0,60 118.051 37,64% 1 28.937 33,60% 35,75% 
74 SRR10971381 Rnd-Wuhan rnd_wuhan 29.903 313.606 30 0,60 125.287 39,95% 1 29.684 30,30% 30,81% 
75 SRR10971381 Rnd-Wuhan rnd_wuhan 29.903 313.606 25 0,62 93.120 29,69% 17 1.470 34,40% - 
76 SRR10971381 Rnd-MK-1 rnd_wh_mk_1 29.903 327.202 47 0,50 112.718 34,45% 1 29.782 32,30% 32,57% 
77 SRR10971381 Rnd-MK-1 rnd_wh_mk_1 29.903 327.202 37 0,60 87.402 26,71% 13 3.899 37,70% - 
78 SRR10971381 Rnd-MK-1 rnd_wh_mk_1 29.903 327.202 32 0,60 126.963 38,80% 1 29.850 25,70% 25,83% 
79 SRR10971381 Rnd-MK-1 rnd_wh_mk_1 29.903 327.202 30 0,60 134.450 41,09% 1 29.850 22,60% 22,74% 
80 SRR10971381 Rnd-MK-1 rnd_wh_mk_1 29.903 327.202 25 0,62 99.369 30,37% 3 16.838 40,10% - 
81 SRR10971381 Rnd-MK-2 rnd_wh_mk_2 29.903 328.524 47 0,50 116.586 35,49% 1 29.793 34,10% 34,34% 
82 SRR10971381 Rnd-MK-2 rnd_wh_mk_2 29.903 328.524 37 0,60 91.824 27,95% 19 4.856 35,30% - 
83 SRR10971381 Rnd-MK-2 rnd_wh_mk_2 29.903 328.524 32 0,60 130.868 39,84% 1 29.881 27,20% 27,25% 
84 SRR10971381 Rnd-MK-2 rnd_wh_mk_2 29.903 328.524 30 0,60 138.261 42,09% 1 29.881 24,20% 24,26% 
85 SRR10971381 Rnd-MK-2 rnd_wh_mk_2 29.903 328.524 25 0,62 103.219 31,42% 16 8.146 35,20% - 
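
The two error-rate columns of Table 4 appear to be linked by a simple relation: R2 seems to count every mismatch in the longest contig plus every reference position the contig does not reach, divided by the reference length. This relation is not stated in the table; it is an assumption that happens to reproduce the tabulated values for the rows checked below (IDs 6, 26 and 27). The function name and the European-to-decimal conversion of the inputs are illustrative only.

```python
def error_rate_vs_reference(r1_percent, contig_length, reference_length):
    """Convert an error rate relative to the contig (R1) into an error rate
    relative to the reference length (R2).

    Assumption (not stated in the source): positions of the reference that
    the contig does not cover are counted as errors.
    """
    mismatches = r1_percent / 100.0 * contig_length
    missing = reference_length - contig_length
    return 100.0 * (mismatches + missing) / reference_length


# Rows taken from Table 4: (ID, R1 in %, longest contig, reference length, R2 in %)
rows = [
    (6, 19.10, 29855, 29903, 19.23),
    (26, 4.00, 1647, 1682, 6.00),
    (27, 7.10, 1656, 1682, 8.54),
]

for row_id, r1, contig, reference, r2_table in rows:
    r2 = error_rate_vs_reference(r1, contig, reference)
    print(f"ID {row_id}: computed R2 = {r2:.2f} %, tabulated R2 = {r2_table:.2f} %")
```

Rows with more than one contig have no R2 entry in the table ("-"), which is consistent with this reading: the relation needs a single contig length.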


Table 5. List of software and commands used in this study. 

Software: SRA Toolkit
Version: 2.8.0
Analysis: Download SRA files
Citation: (SRA Toolkit Development Team, no date)
Command: fastq-dump --split-files --origfmt --gzip SRR10971381

Software: Fastp
Version: 0.23.1
Analysis: FASTQ Preprocessing
Citation: (Chen et al., 2018)
Command: fastp -i [...]

Software: Megahit
Version: 1.2.9
Analysis: De novo Assembly
Citation: (Li et al., 2015)
Command: megahit -1 SRR10971381_1.fastq -2 SRR10971381_2.fastq -o megahit_result

Software: BBMap
Version: 38.93
Analysis: Alignment of short reads
Citation: (Bushnell, 2014)
Command: mapPacBio.sh in=SRR10971381_1.fastq [...] outm=mapped.sam vslow k=8 maxindel=0 minratio=0.1

Software: BBMap
Version: 38.93
Analysis: Selection of short reads
Citation: (Bushnell, 2014)
Command: reformat.sh in=mapped.sam out=sample_selection.sam minlength=$M1 (maxlength=100) idfilter=$M2 ow=t

Software: BWA
Version: 0.7.17-r1188
Analysis: Alignment of short reads
Citation: (Li, 2013)
Command: bwa mem reference.fasta $left.fastq $right.fastq > out.sam

Software: Bowtie2
Version: 2.4.4
Analysis: Alignment of short reads
Citation: (Langmead et al., 2018)
Command: bowtie2 -x cov -1 SRR10971381_1.fastq -2 SRR10971381_2.fastq --no-unal -p 12 -S sample_final.sam

Software: Samtools
Analysis: Analysis of sam/bam file
Citation: (Li et al., 2009)
Commands:
    samtools view -b sample_selection.sam > sample.bam
    samtools sort sample.bam -o sample_sort_reads.bam

Software: Samtools, bcftools
Version: 1.14
Analysis: Consensus sequence
Citation: (Li et al., 2009)
Commands:
    samtools index sample_sort_reads.bam
    samtools mpileup -uf mapping/$reference.fasta sample_sort_reads.bam | bcftools call -c | vcfutils.pl vcf2fq > sample_cns.fastq

Software: Seqtk
Version: 1.3-r106
Analysis: Convert to FASTA and quality control
Citation: (Shen, 2016)
Command: seqtk seq -aQ64 -q20 -n N sample_cns.fastq > sample_cns.fasta

Ø Read length 145,56
P(Covering a nucleotide) 0,00486776
EN (Expected coverage) 592,7907
VARN (Binomial distribution) 589,9052
Covered nucleotides 29.903


Figure 1: Reference MN908947.3. a) MN908947_reads mapped with Bowtie2 using default settings. b) MN908947_primer mapped using BBMap. c) Quantiles were determined from EN and VARN under the hypothesis of a binomial distribution. d) The 26 primer pairs ([1], Supplementary Table 8. PCR primers used in this study.) are evenly distributed across the entire reference genome. The primer positions correlate with areas of high nucleotide coverage. 
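
The quantities EN and VARN in the statistics panels can be reproduced from the genome length, the number of reads and the mean read length. The sketch below assumes that P(Covering a nucleotide) is the mean read length divided by the genome length and that each read covers a given position independently, so the per-position coverage is binomial; this reading is inferred from the tabulated numbers and is not spelled out in the source. The function name is illustrative.

```python
def binomial_coverage(genome_length, n_reads, mean_read_length):
    """Return (p, EN, VARN) for per-position coverage under a binomial model.

    Assumption: a read covers a fixed position with probability
    p = mean_read_length / genome_length, independently of other reads.
    """
    p = mean_read_length / genome_length
    expected = n_reads * p                 # EN  = N * p
    variance = n_reads * p * (1.0 - p)     # VARN = N * p * (1 - p)
    return p, expected, variance


# Values for reference MN908947.3 as listed in the statistics panels
# (29.903 nt, 121.779 reads, mean read length 145,56).
p, en, varn = binomial_coverage(29903, 121779, 145.56)
print(f"P(Covering a nucleotide) = {p:.8f}")
print(f"EN   = {en:.4f}")
print(f"VARN = {varn:.4f}")
```

Small deviations in the last digits relative to the panel values are expected, because the mean read length is only given to two decimals.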
Reference: MN908947.3
[Coverage plot. x-axis: Nucleotide position. Series: Exponential distributed coverage, MN908947_short_reads.]

Reference - MN908947.3
Genome length 29.903
Number of reads 121.779
Ø Read length 145,56
P(Covering a nucleotide) 0,00486776
EN (Expected coverage) 592,7907
VARN (Binomial distribution) 589,9052
Covered nucleotides 29.903
Coverage in % 100,00%

Reference - MN908947.3 - Short reads
Genome length 29.903
Number of reads 59.949
Ø Read length 46,24
P(Covering a nucleotide) 0,00154643
Lambda 0,01078668
EN (Expected coverage) 92,7070
VARN (Exponential distribution) 8.595
VARN (Trimmed 99,5%) 19.129
Covered nucleotides 29.903
Coverage in % 100,00%


Figure 2: Reference MN908947.3. a) MN908947_reads mapped with Bowtie2 using default settings. b) MN908947_short_reads mapped using BBMap, (M1; M2) = (37 (max. 100); 0.60). c) Exponentially distributed coverage was generated by stochastic simulation using the inversion method. The coverage distribution of MN908947_short_reads shows a more random pattern but has a higher trimmed variance. This is mainly due to the few swings in the coverage distribution. 
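
The inversion method mentioned in the caption can be sketched as follows: a uniform random number u is mapped through the inverse of the exponential distribution function, x = -ln(1 - u) / lambda. The code is a minimal illustration, not the simulation used for the figure; the seed, the sample size and the function name are assumptions, and only Lambda = 0,01078668 is taken from the statistics panel.

```python
import math
import random


def sample_exponential(lam, n, seed=0):
    """Draw n exponentially distributed values by inverse transform sampling.

    For u uniform on [0, 1), x = -ln(1 - u) / lam has rate parameter lam.
    """
    rng = random.Random(seed)  # fixed seed (illustrative) for reproducibility
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]


lam = 0.01078668  # "Lambda" from the short-reads statistics panel
samples = sample_exponential(lam, 29903)  # one value per genome position (assumed)

mean = sum(samples) / len(samples)
variance = sum((x - mean) ** 2 for x in samples) / len(samples)

print(f"theoretical mean     1/lambda   = {1.0 / lam:.4f}")
print(f"theoretical variance 1/lambda^2 = {1.0 / lam ** 2:.1f}")
print(f"simulated mean = {mean:.2f}, simulated variance = {variance:.1f}")
```

The theoretical mean and variance correspond to the panel entries EN (Expected coverage) and VARN (Exponential distribution) for the short reads.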
Reference: MG772933.1
[Coverage plot. x-axis: Nucleotide position. Series: MG772933_primer, MG772933_reads, MG772933_short_reads.]

Reference - MG772933.1
Genome length 29.802
Number of reads 50.722
Ø Read length 146,80
P(Covering a nucleotide) 0,00492572
EN (Expected coverage) 249,8422
VARN (Binomial distribution) 248,6115
Covered nucleotides 22.684
Coverage in % 76,12%
Error rate in % 11,50%

Reference - MG772933.1 - short reads
Genome length 29.802
Number of reads 183.727
Ø Read length 110,56
P(Covering a nucleotide) 0,00370972
EN (Expected coverage) 681,5748
VARN (Binomial distribution) 679,0464
Covered nucleotides 29.802
Coverage in % 100,00%

Figure 3: Reference MG772933.1. a) MG772933_reads mapped with Bowtie2 using default settings.
b) MG772933_short_reads mapped using BBMap, (M1; M2) = (37; 0.60). c) MG772933_primer mapped
using BBMap. d) The coverage distribution in a) covers 76.12% of the reference sequence MG772933.1.
Complete coverage is achieved with the sequences in b). The calculated consensus sequences
(Table 4, 11-20) show error rates of about 10%, in agreement with the presentation in [1].
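
The "Covered nucleotides" and "Coverage in %" rows can be derived from a per-position depth vector. The following Python sketch assumes such a vector is available (for example, exported from the mapper) and that a position counts as covered when its depth is at least 1; both the function name and the threshold are assumptions, not taken from the source.

```python
def coverage_percent(depth, min_depth=1):
    """Return (covered positions, coverage in %) for a per-position depth list.

    A position is counted as covered if its depth reaches min_depth
    (assumed threshold of 1 read).
    """
    if not depth:
        raise ValueError("depth vector is empty")
    covered = sum(1 for d in depth if d >= min_depth)
    return covered, 100.0 * covered / len(depth)


if __name__ == "__main__":
    # Toy depth vector: 2 of 4 positions are covered.
    print(coverage_percent([0, 3, 5, 0]))
```

For MG772933.1, 22,684 covered positions out of 29,802 correspond to the 76.12% reported in the table.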
[Plot: Reference AY278741.1. x-axis: nucleotide position. Series: AY278741.1_primer, AY278741.1_reads, AY278741.1_short_reads.]


Reference - AY278741.1
Genome length                      29.727
Number of reads                    11.332
Ø Read length                      145,77
P(Covering a nucleotide)           0,00490377
EN (Expected coverage)             55,5695
VARN (Binomial distribution)       55,2970
Coverage in %                      28,21%
Error rate in %                    21,50%

Reference - AY278741.1 - short reads
Genome length                      29.727
Number of reads                    168.076
Ø Read length                      113,91
P(Covering a nucleotide)           0,00383173
EN (Expected coverage)             644,0224
VARN (Binomial distribution)       641,5547
Covered nucleotides                29.727
Coverage in %                      100,00%

Figure 4: Reference AY278741.1. a) AY278741.1_reads mapped with Bowtie2 using default settings.
b) AY278741.1_short_reads mapped using BBMap, (M1; M2) = (37; 0.60). c) AY278741.1_primer
mapped using BBMap. d) The coverage distribution in a) covers 28.21% of the reference sequence
AY278741.1. Complete coverage is achieved with the sequences in b). The calculated consensus
sequences (Table 4, 46-50) show error rates of about 12.8%.
[Plot: Reference AY274119.3. Series: AY274119.3_primer, AY274119.3_reads, AY274119.3_short_reads.]


Reference - AY274119.3
Genome length                      29.751
Number of reads                    11.419
Ø Read length                      144,84
P(Covering a nucleotide)           0,00486833
EN (Expected coverage)             55,5915
VARN (Binomial distribution)       55,3208
Coverage in %                      28,22%
Error rate in %                    21,50%

Reference - AY274119.3 - short reads
Genome length                      29.751
Number of reads                    170.197
Ø Read length                      112,92
P(Covering a nucleotide)           0,00379535
EN (Expected coverage)             645,9568
VARN (Binomial distribution)       643,5051
Covered nucleotides                29.751
Coverage in %                      100,00%

Figure 5: Reference AY274119.3. a) AY274119.3_reads mapped with Bowtie2 using default settings.
b) AY274119.3_short_reads mapped using BBMap, (M1; M2) = (37; 0.60). c) AY274119.3_primer
mapped using BBMap. d) The coverage distribution in a) covers 28.22% of the reference sequence
AY274119.3. Complete coverage is achieved with the sequences in b). The calculated consensus
sequences (Table 4, 51-55) show error rates of about 12.8%.
Figure 6: Reference LC312715.1. a) LC312715.1_short_reads mapped using BBMap, (M1; M2) = (37; 0.60). b) LC312715.1_primer mapped using BBMap.
Figure 7: Reference NC_001653.2. a) NC_001653.2_short_reads mapped using BBMap, (M1; M2) = (37; 0.60). b) NC_001653.2_primer mapped using BBMap.
Figure 8: Reference NC_035889.1. a) NC_035889.1_short_reads mapped using BBMap, (M1; M2) = (37; 0.60). b) NC_035889.1_primer mapped using BBMap.
Figure 9: Reference AF266291.1. a) AF266291.1_short_reads mapped using BBMap, (M1; M2) = (37; 0.60). b) AF266291.1_primer mapped using BBMap.
Figure 10: Reference KJ410048.1. a) KJ410048.1_short_reads mapped using BBMap, (M1; M2) = (37; 0.60). b) KJ410048.1_primer mapped using BBMap.
Figure 11: Reference NC_039345.1. a) NC_039345.1_short_reads mapped using BBMap, (M1; M2) = (37; 0.60). b) NC_039345.1_primer mapped using BBMap.
Figure 12: Reference NC_024781.1. a) NC_024781.1_short_reads mapped using BBMap, (M1; M2) = (37; 0.60). b) NC_024781.1_primer mapped using BBMap.
Figure 13: Reference rnd_uniform. a) rnd_uniform_reads mapped using BBMap, (M1; M2) = (37;
0.60). b) rnd_uniform_primer mapped using BBMap. c) Exponentially distributed coverage was generated
by stochastic simulation using the inversion method. d) The 26 primer pairs ([1], Supplementary Table
8, "PCR primers used in this study") are unevenly distributed across the entire reference genome. The
primer positions correlate only weakly with the areas of high nucleotide coverage, each of which
comprises only a few nucleotides. e) The distribution of rnd_uniform_reads appears largely random.
The variance of the exponential distribution considered agrees well with the trimmed empirical variance.
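
The inversion method mentioned in c) can be illustrated as follows. This is a minimal Python sketch, not the simulation used for the figure: it draws exponential variates via X = −ln(1 − U)/λ with U uniform on [0, 1), and it computes a trimmed variance under the assumption that "Trimmed 99,5%" means discarding the largest 0.5% of values. The function names and the trimming rule are hypothetical.

```python
import math
import random


def sample_exponential(lam, n, rng):
    """Draw n exponential variates with rate lam by inverse transform sampling."""
    # rng.random() lies in [0, 1), so 1 - U lies in (0, 1] and the log is defined.
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]


def trimmed_variance(values, keep=0.995):
    """Sample variance after dropping the largest (1 - keep) share of values.

    The one-sided trimming rule is an assumption (see lead-in).
    """
    ordered = sorted(values)
    k = max(2, int(len(ordered) * keep))
    kept = ordered[:k]
    mean = sum(kept) / len(kept)
    return sum((v - mean) ** 2 for v in kept) / (len(kept) - 1)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducibility
    # Lambda and genome length taken from the MN908947.3 short-reads table.
    coverage = sample_exponential(0.01078668, 29903, rng)
    print(sum(coverage) / len(coverage), trimmed_variance(coverage))
```

Comparing the trimmed variance of the simulated values with that of the empirical coverage vector gives the kind of agreement check described in e).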
Figure 14: Reference rnd_wuhan. a) rnd_wuhan_reads mapped using BBMap, (M1; M2) = (37; 0.60).
b) rnd_wuhan_primer mapped using BBMap. c) The coverage distribution shows a largely random
pattern, comparable to Figure 13. d) The average read length is slightly above that of rnd_uniform
(Figure 13). This is due to the empirical distribution of the nucleotides (A, T, C and G), which
follows the reference for SARS-CoV-2 (MN908947.3).
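
How the random references were constructed is not described in this section. The Python sketch below shows one plausible reading, assuming that rnd_uniform draws each base with probability 1/4 and that rnd_wuhan draws bases with the empirical frequencies of a given reference sequence. The function names are hypothetical, and the short sequence in the usage example is a stand-in, not MN908947.3.

```python
import random
from collections import Counter


def base_frequencies(sequence):
    """Relative frequencies of A, C, G and T in a sequence (other symbols ignored)."""
    counts = Counter(b for b in sequence.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: counts[b] / total for b in "ACGT"}


def random_reference(length, freqs, rng):
    """Draw an i.i.d. random sequence with the given base frequencies."""
    bases = list(freqs)
    weights = [freqs[b] for b in bases]
    return "".join(rng.choices(bases, weights=weights, k=length))


if __name__ == "__main__":
    rng = random.Random(1)
    uniform = {b: 0.25 for b in "ACGT"}
    empirical = base_frequencies("ATTAAAGGTTTATACCTTCC")  # stand-in sequence
    print(random_reference(60, uniform, rng))
    print(random_reference(60, empirical, rng))
```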
Figure 15: Reference rnd_wh_mk_1. a) rnd_wh_mk_1_reads mapped using BBMap, (M1; M2) = (37;
0.60). b) rnd_wh_mk_1_primer mapped using BBMap. c) The coverage distribution shows a largely
random pattern, comparable to Figure 13.
Figure 16: Reference rnd_wh_mk_2. a) rnd_wh_mk_2_reads mapped using BBMap, (M1; M2) = (37;
0.60). b) rnd_wh_mk_2_primer mapped using BBMap. c) The coverage distribution shows a largely
random pattern, comparable to Figure 13.
Figure 17: Reference k141_5255. a) k141_5255_reads mapped with Bowtie2 using default settings. b) k141_5255_primer mapped using BBMap.
Figure 18: Reference k141_12253. a) k141_12253_reads mapped with Bowtie2 using default settings. b) k141_12253_primer mapped using BBMap.
Figure 19: Reference k141_14387. a) k141_14387_reads mapped with Bowtie2 using default settings. b) k141_14387_primer mapped using BBMap.
Figure 20: Reference k141_7303. a) k141_7303_reads mapped with Bowtie2 using default settings. b) k141_7303_primer mapped using BBMap.
Figure 21: Reference k141_20796. a) k141_20796_reads mapped with Bowtie2 using default settings. b) k141_20796_primer mapped using BBMap.
Figure 22: Reference k141_21803. a) k141_21803_reads mapped with Bowtie2 using default settings. b) k141_21803_primer mapped using BBMap.




[Read-length histograms (x-axis: read length; y-axis: number of reads). Panels: a) Reference MN908947.3, b) Reference LC312715.1, c) Reference NC_001653.2, d) Reference AF266291.1, e) Reference rnd_uniform, f) Reference rnd_wuhan.]

Figure 23: a)-f) Mapped using BBMap, (M1; M2) = (37; 0.60). Analysis in Excel.
Figure 24: a)-f) Mapped using BBMap, (M1; M2) = (37; 0.60). Analysis in Excel.
Figure 25: a)-c) Mapped using BBMap, (M1; M2) = (37; 0.60). Analysis in Excel.